ABSTRACT

The Sequential Unconstrained Minimization Technique (SUMT) for convex
programming problems is modified by the introduction of an exponent in
the penalty term. The exponent is introduced to increase the rate of
convergence of the method for nonlinear problems with solutions on the
boundary of one or more constraints. Convergence to the solution of the
constrained problem is proved, and it is shown that SUMT is a special case
of the general unconstrained function with the exponent equal to one.
Results of a sample problem indicate that the rate of convergence is
improved and that the computational time for solution is decreased for an
exponent less than one.
1. Introduction.

Since 1951, when Kuhn and Tucker extended the method of Lagrange
multipliers to include inequality constraints, the techniques used for
optimization of nonlinear problems have developed rapidly. The growth of
this mathematical tool in the last decade is due to its successful
application to many military and industrial problems. Section 2 describes
the general convex programming problem and the theorems and conditions
that assure convergence. Section 3 outlines one iterative method, the
so-called Sequential Unconstrained Minimization Technique, hereafter
called SUMT.

SUMT converts a constrained convex programming problem, i.e., minimize
$f(x)$ subject to $g_i(x) \ge 0$, $i = 1,2,\ldots,m$, to an unconstrained
problem: minimize $f(x) + r \sum_{i=1}^{m} 1/g_i(x)$, $r > 0$. A second
order gradient method is used to minimize the unconstrained function for
a fixed value of $r$; then $r$ is reduced and the procedure is repeated
until the solution of the constrained problem is approximated.

In section 4, the unconstrained problem of SUMT is modified to the
form $f(x) + r \sum_{i=1}^{m} [1/g_i(x)]^v$, $r > 0$, $v > 0$. The proof
of convergence of the modified SUMT is in section 4.1. SUMT was modified
to increase the rate of convergence for convex programming problems with
solutions on the boundary of one or more constraints.

The effect of the parameter $v$ on the unconstrained problem and on
its gradient is analyzed in section 5. The increase in the rate of
convergence of SUMT is shown for a sample problem.


2. The Convex Programming Problem.

The problem of optimizing a function, subject to constraints, occurs
frequently in industry, economics, and in pure and applied mathematics.
The development of the high speed digital computer in the late forties
of this century has generated a new interest in optimization problems
that were too complicated or time consuming for hand computation.

In the general case we wish to solve a problem of the form: find an
n-dimensional vector $x = (x_1, x_2, \ldots, x_n)$ that maximizes or
minimizes the objective function $f(x)$, subject to the constraints
$g_i(x) \ge 0$. Additional constraints such as all $x_j \ge 0$, or
combinations thereof, may be required and in this paper will be
considered absorbed into the $g_i(x)$'s.

In 1947 George Dantzig devised the simplex algorithm for solving the
general linear programming problem, that is, a problem of the form:
optimize $f(x) = \sum_j c_j x_j$, subject to
$g_i(x) = \sum_j a_{ij} x_j \ge 0$, $i = 1,\ldots,m$, where $c_j$ and
$a_{ij}$ are constants. The simplex algorithm is capable of solving
linear problems with several hundred variables and/or constraints.
The majority of the practical problems solved by the simplex method are
linear approximations of nonlinear ones, and considerable effort has
been expended to find a direct method of solving nonlinear programming
problems.

To date no general method has been found for nonlinear programming;
however, many special methods exist for solving particular types of
nonlinear problems. The programmer, faced with a nonlinear optimization
problem, must decide, based on his knowledge of the functions and the
accuracy desired, whether to use a linear approximation and the simplex
method or try to find a nonlinear method which solves his problem.


The method of Lagrange multipliers [10] provides the classical approach
to optimization problems. Let $f(x)$ be the objective function to be
optimized, subject to $g_i(x) = 0$, $i = 1,2,\ldots,m$, then form the
function:
     $$L(x, \lambda) = f(x) + \lambda_1 g_1(x) + \lambda_2 g_2(x) + \cdots + \lambda_m g_m(x)$$
where the $\lambda_i$ are constants,
then differentiate $L(x, \lambda)$ with respect to $x_j$ and $\lambda_i$
and set each equal to zero:
     $$\frac{\partial L(x,\lambda)}{\partial x_j} = \frac{\partial f(x)}{\partial x_j} + \sum_{i=1}^{m} \lambda_i \frac{\partial g_i(x)}{\partial x_j} = 0, \quad j = 1,\ldots,n$$
     $$\frac{\partial L(x,\lambda)}{\partial \lambda_i} = g_i(x) = 0, \quad i = 1,\ldots,m$$

Lagrange discovered that if a vector $x$ is a solution to the optimization
problem it will also satisfy the above $n + m$ equations. The
$\lambda_i$'s are known as the Lagrange multipliers and are interpreted in
economic problems as the "shadow prices". The method of Lagrange is of
great theoretical value but unfortunately is not very useful in practice,
and all but very simple problems can be solved more easily by other
methods.

In 1951 Kuhn and Tucker generalized the theory of Lagrange multipliers
to inequality constraints and non-negative variables. Before
discussing the Kuhn-Tucker theorem it is necessary to provide some
definitions:


Convex Function [10]: The function $f(x)$ is said to be convex over
a convex set $X$ in $E^n$ if for any two points $x_1$ and $x_2$ in $X$ and
for all $\lambda$, $0 \le \lambda \le 1$,
     $$f[\lambda x_2 + (1-\lambda) x_1] \le \lambda f(x_2) + (1-\lambda) f(x_1)$$

[Figure Ia: plot of f(x) with a region labeled convex and a region labeled concave]

[Figure Ib: plot of f(x) with the global maximum labeled]

Concave Function [10]: The function $f(x)$ is said to be concave
over a convex set $X$ in $E^n$ if for any two points $x_1$ and $x_2$
in $X$, and for all $\lambda$, $0 \le \lambda \le 1$,
     $$f[\lambda x_2 + (1-\lambda) x_1] \ge \lambda f(x_2) + (1-\lambda) f(x_1)$$

The function shown in Figure Ia is convex over the interval $A \le x \le B$,
and concave over the interval $B \le x \le C$; however, it is neither convex
nor concave over the interval $A \le x \le C$.

An equivalent definition of convex function is: if $f(x)$ is twice
continuously differentiable then the matrix of second partials of $f(x)$,
i.e.,
     $$\left\| \frac{\partial^2 f(x)}{\partial x_i \partial x_j} \right\|$$
is positive semi-definite (negative semi-definite for concave functions).
Note: if $f(x)$ is convex, then $-f(x)$ is concave.

Strictly Convex (concave): A function $f(x)$ is strictly convex
(concave) if the strict inequality holds in the above definition
whenever $x_1 \ne x_2$ and $0 < \lambda < 1$.


It is obvious from the definitions that, 
a) a linear function is both concave and convex, 


b) the sum of concave (convex) functions is a concave (convex) 
function. 


Global Maximum: The function $f(x)$ defined over a closed set $X$
in $E^n$ is said to take on a global maximum over $X$ at the point $x^*$
if $f(x) \le f(x^*)$ for every point $x \in X$.

Local Maximum: The function $f(x)$, defined at all points in a
$\delta$-neighborhood of $x^*$ in $E^n$, is said to take on a local
maximum at $x^*$ if for all $x$ in the $\delta$-neighborhood, i.e.,
$|x - x^*| < \delta$, $f(x) \le f(x^*)$.

The definitions of a global minimum and local minimum are obtained
by reversing the inequalities. The function shown in Figure Ib has, for
$A \le x \le E$, a global maximum at B and a local maximum at D, two local
minima at A and E, and a global minimum at C.

The convex programming problem can now be stated: 

Problem A. 


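Minimize a continuously differentiable convex function
     $f(x)$
Subject to the constraints
     $g_i(x) \ge 0$,  $i = 1,2,\ldots,m$
where each constraint is continuously differentiable and concave.
The desirable feature of the above restrictions is that a local
minimum is also a global minimum.
Problem B.
Form the Lagrangian function
     $$\phi(x, \lambda) = f(x) - \sum_{i=1}^{m} \lambda_i g_i(x)$$
then find the vectors $\bar{x}$ and $\bar{\lambda}$ such that:
     $$\phi(\bar{x}, \lambda) \le \phi(\bar{x}, \bar{\lambda}) \le \phi(x, \bar{\lambda})$$
for all $x \ge 0$, $\lambda \ge 0$, i.e., $x_j \ge 0$, $\lambda_i \ge 0$,
where $x_j$ and $\lambda_i$ are the components of the vectors $x$ and
$\lambda$.

Kuhn and Tucker established the necessary and sufficient conditions
for $\bar{x}$ and $\bar{\lambda}$ to provide a solution to B.

Let $\phi_x = [\partial\phi/\partial x]_{\bar{x},\bar{\lambda}}$ and
$\phi_\lambda = [\partial\phi/\partial\lambda]_{\bar{x},\bar{\lambda}}$;
the necessary conditions take the form
     $\phi_x \ge 0$,  $\phi_x^T \bar{x} = 0$,  $\bar{x} \ge 0$
     $\phi_\lambda \le 0$,  $\phi_\lambda^T \bar{\lambda} = 0$,  $\bar{\lambda} \ge 0$
where the second equation in each row is the dot product of two vectors.

Together with the above conditions,
     $\phi(\bar{x}, \lambda) \le \phi(\bar{x}, \bar{\lambda}) + \phi_\lambda^T (\lambda - \bar{\lambda})$
     $\phi(x, \bar{\lambda}) \ge \phi(\bar{x}, \bar{\lambda}) + \phi_x^T (x - \bar{x})$
     $x \ge 0$,  $\lambda \ge 0$
form the sufficient conditions for the solution of problem B.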


The Kuhn and Tucker "Equivalence Theorem", which states that
problem A is equivalent to problem B, has been the basis for important
additional theoretical work in nonlinear programming, such as the
"differential" gradient methods of Arrow, Hurwicz, and Uzawa and the
Duality Theorems.

If problem A has a strictly convex objective function and there exists
a point $x$ in the feasible domain such that for each $i$
     $g_i(x) > 0$
then the dual problem to problem A can be stated:

Problem C.

     Maximize  $G(x, \lambda) = f(x) - \sum_{i=1}^{m} \lambda_i g_i(x)$
     Subject to  $\nabla_x f(x) = \sum_{i=1}^{m} \lambda_i \nabla_x g_i(x)$,  $\lambda_i \ge 0$

where $\nabla_x$ is the gradient of the function with respect to the
vector $x$. If either problem A or C has a finite solution the other does,
and moreover Minimum $f(x)$ = Maximum $G(x, \lambda)$, and if $x$ is a
solution to problem A then $x$ together with $\lambda$ is a solution to C.
The dual or complementary variables can be interpreted as properties of
the system; for example, if the original variables are such that their
magnitudes increase proportionally to the system (mass, cost, etc.) then
the dual variables' magnitudes will be independent of the size of the
system (pressure, price, etc.) and conversely.


Hadley, Dorn, and Arrow [2] are excellent references for the
various theoretical and computational methods in current use to solve
nonlinear problems. The methods will not be covered in this paper;
however, it should be pointed out that each method has advantages and
disadvantages for a particular type of problem, which should be studied
carefully prior to use.

The field of nonlinear programming is young. Most of the theoretical
and computational work is less than 10 years old, and considerable
work remains to be done. The age of the field shows in the lack of
literature on this subject. Fiacco and McCormick [8] sum up the
situation very well in the following quotation:


We have already deplored the general dearth of such information 
in the literature on nonlinear programming, which not only makes 
comparative analysis between practitioners exceedingly difficult, 
but makes it virtually impossible for a potential user to decide 
whether a given problem can be solved and, if it can, to estimate 
the required effort. 




3. Sequential Unconstrained Minimization Technique.

In 1961, Carroll proposed that the constrained, concave programming
problem: maximize $f(x)$, subject to $g_i(x) \ge 0$, $i = 1,2,\ldots,m$,
where $x$ is an n-dimensional vector, be modified to the unconstrained
problem:

     Maximize  $P(x, r) = f(x) - r \sum_{i=1}^{m} \frac{w_i}{g_i(x)}$

with $w_i \ge 0$ and $r > 0$.
Carroll's unconstrained P function has several desirable properties:
     a) the summation term (called the penalty function) approaches
$-\infty$ if the boundary of a constraint is reached;

     b) if the functions $f(x)$ and $g_i(x)$ have continuous first and
second partial derivatives inside the feasible region then the well known
necessary conditions for an unconstrained function to have a local maximum
apply, that is, the gradient vanishes at that point and the matrix of
second partials is negative semi-definite;

     c) if $f(x)$ is strictly concave or any $g_i(x)$ is strictly
concave then the local maximum is the global maximum;

     d) first or second order gradient methods may be used to
maximize $P(x, r)$;

     e) when a maximum of $P(x, r)$ is reached for a fixed value of
$r$, $r$ can be reduced ($r_1 > r_2 > \cdots > r_k > 0$) and the new
$P(x, r)$ solved for a maximum;

     f) as $r \to 0$ the maximum of $P(x, r)$ approaches the maximum
of $f(x)$ and the penalty term approaches zero.

Carroll demonstrated that the computational method will solve convex
programming problems.
Fiacco and McCormick [8] proved Carroll's conjecture for the
corresponding convex programming problem. Their work established the
proof of the following:
Given the convex programming problem (Problem A)
     Minimize  $f(x)$
     Subject to  $g_i(x) \ge 0$,  $i = 1,2,\ldots,m$
(where $x$ is an n-dimensional vector)

Define the function:
     $$P(x, r) = f(x) + r \sum_{i=1}^{m} \frac{1}{g_i(x)}$$
where $r_1 > r_2 > \cdots > r_k > \cdots > 0$.
Define $x(k)$ as the vector minimizing $P(x, r_k)$, and let $V_0$
be the constrained minimum value of $f(x)$.
Then:
     $$\lim_{k \to \infty} P[x(k), r_k] = V_0$$

The following additional conditions must hold.
     C1: $R^\circ = \{ x \mid g_i(x) > 0,\ i = 1,\ldots,m \}$ is not
empty; denote by $R$ the closure of $R^\circ$.
     C2: $f(x)$ and $-g_i(x)$ are convex and twice continuously
differentiable for $x \in R$.
     C3: For every finite $h$, $\{ x \mid f(x) \le h,\ x \in R \}$
is a bounded set.
     C4: For every $r > 0$, $P(x, r)$ is strictly convex.
Condition C1 is necessary for the method to apply since it is an
inside method, condition C3 ensures that a local minimum is achieved at
a finite point, and condition C4 is necessary to ensure that a local
minimum is the global minimum. Condition C4 is satisfied if any of the
following statements apply to the problem:
     a) $f(x)$ is strictly convex,

     b) there are n independent linear constraints on the problem
(the constraints $x_j \ge 0$, $j = 1,\ldots,n$ are a special case),

     c) any constraint $g_i(x)$ is strictly concave.

Fiacco and McCormick [8] showed that the manner in which the primal
problem is solved yields a set of points that are dual-feasible (problem
C) and approach the dual optimum in the limit as $r \to 0$. They further
demonstrated that the solutions to the primal and dual problems for a fixed
value of $r$, say $r_k$, bound the final solution to the problem, that
is: if $x(k)$ is the optimum solution to $P(x, r_k)$ and $V_0$ is the
optimum solution of the constrained $f(x)$ then:
     $$G[x(k), \lambda(k)] \le V_0 \le f[x(k)] \le P[x(k), r_k]$$

This theoretical development provides a criterion for termination of
the computational method. Fiacco and McCormick also proved that the
optimum solution to the subproblem (minimize $P(x, r_k)$) successively
decreases to $V_0$; that is, for $r_k > r_{k+1} > 0$,
     $$P[x(k), r_k] > P[x(k+1), r_{k+1}]$$

Fiacco and McCormick with Mylander [9] developed a computer routine
to apply the Sequential Unconstrained Minimization Technique (SUMT) to
convex programming problems, using a modification to Davidon's [5] second
order gradient method. The SUMT program, written in FORTRAN-4,
is available from the SHARE library under distribution number SDA 3189.
The program is still in the experimental stage and the authors are in the
process of rewriting it to increase its efficiency and accuracy.

The program as now written will handle up to 100 variables and 200
constraints (including any restrictions on the variables). SUMT has
solved several nonconvex problems; however, since there is no theoretical
justification for the convergence of a non-convex problem the user must
interpret the answers carefully.


An example might serve to illustrate the principles of SUMT.

     Minimize  $x_1 + x_2$
     Subject to  $x_1 - 1 \ge 0$
                 $x_2 - 1 \ge 0$

Both the objective function and the constraints are linear and
therefore are both convex and concave. The four conditions required by
SUMT are met.

Using the method of Lagrange:
     $L(x, \lambda) = x_1 + x_2 + \lambda_1 (x_1 - 1) + \lambda_2 (x_2 - 1)$
     $\partial L / \partial x_1 = 1 + \lambda_1 = 0$,  so  $\lambda_1 = -1$
     $\partial L / \partial x_2 = 1 + \lambda_2 = 0$,  so  $\lambda_2 = -1$
     $\partial L / \partial \lambda_1 = x_1 - 1 = 0$,  so  $x_1 = 1$
     $\partial L / \partial \lambda_2 = x_2 - 1 = 0$,  so  $x_2 = 1$
at the point (1,1), $f(x) = 2$, the actual minimum of the constrained
problem.

Using SUMT:
     $$P(x, r) = x_1 + x_2 + \frac{r}{x_1 - 1} + \frac{r}{x_2 - 1}$$
the necessary condition for an unconstrained minimum:
     $\partial P / \partial x_1 = 1 - \frac{r}{(x_1 - 1)^2} = 0$,  so  $x_1 = 1 \pm \sqrt{r}$
     $\partial P / \partial x_2 = 1 - \frac{r}{(x_2 - 1)^2} = 0$,  so  $x_2 = 1 \pm \sqrt{r}$
Since $r > 0$ and $1 - \sqrt{r}$ is not a feasible point, it is
therefore rejected as a possible solution, then
     $x_1 = 1 + \sqrt{r}$,  $x_2 = 1 + \sqrt{r}$
then:
     $\lim_{r \to 0} x_1(r) = \lim_{r \to 0} x_2(r) = \lim_{r \to 0} (1 + \sqrt{r}) = 1$
     $\lim_{r \to 0} f[x(r)] = \lim_{r \to 0} (2 + 2\sqrt{r}) = 2$
The same minimum was achieved using the method of Lagrange. To
demonstrate the iterative procedure,
let $r_1 = 1$:
     $P(x, 1) = x_1 + x_2 + \frac{1}{x_1 - 1} + \frac{1}{x_2 - 1}$
     $x_1 = 1 + \sqrt{1} = 2$,  $x_2 = 1 + \sqrt{1} = 2$
and the minimum of $P[x(1), r_1] = 6$.
Let $r_2 = 1/4$:
     $P(x, 1/4) = x_1 + x_2 + \frac{1/4}{x_1 - 1} + \frac{1/4}{x_2 - 1}$
     $x_1 = 1 + 1/2 = 3/2$,  $x_2 = 1 + 1/2 = 3/2$
and the minimum of $P[x(2), r_2] = 4$,
and $P[x(2), r_2] < P[x(1), r_1]$.
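
The iteration above can be checked numerically. The following is a minimal sketch, not the SHARE program: it uses the closed-form stationary point $x_1 = x_2 = 1 + \sqrt{r}$ derived above instead of a gradient search, and the function names (`sumt_minimizer`, `penalty_value`, `run`) and the sequence of r values beyond $r_1 = 1$ and $r_2 = 1/4$ are illustrative assumptions.

```python
import math


def sumt_minimizer(r):
    """Feasible stationary point of P(x, r) = x1 + x2 + r/(x1-1) + r/(x2-1).

    Solving 1 - r/(x-1)^2 = 0 gives x = 1 +/- sqrt(r); only the + root
    satisfies x - 1 > 0, so that root is returned for both coordinates.
    """
    x = 1.0 + math.sqrt(r)
    return x, x


def penalty_value(x1, x2, r):
    """Value of the unconstrained function P(x, r) at an interior point."""
    return x1 + x2 + r / (x1 - 1.0) + r / (x2 - 1.0)


def run(r_values):
    """Return (r, x1, x2, f, P) for each r in the given decreasing sequence."""
    rows = []
    for r in r_values:
        x1, x2 = sumt_minimizer(r)
        rows.append((r, x1, x2, x1 + x2, penalty_value(x1, x2, r)))
    return rows


if __name__ == "__main__":
    # r is divided by 4 at each step; only the first two values appear in the text.
    for r, x1, x2, f, p in run([1.0, 0.25, 0.0625, 0.015625]):
        print(f"r={r:<9g} x=({x1:.4f}, {x2:.4f})  f={f:.4f}  P={p:.4f}")
```

For $r_1 = 1$ and $r_2 = 1/4$ the sketch reproduces the values 6 and 4 computed by hand, and both f and P decrease toward the constrained minimum 2 as r is reduced.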
Fiacco [7,8] further showed that a modification to the unconstrained
method could be used to find a feasible starting point if one was not
known. Given a starting point $x^0$ with at least one of the constraints
not satisfied, we define index sets
     $S = \{ s \mid g_s(x^0) \le 0 \}$
     $T = \{ t \mid g_t(x^0) > 0 \}$
Pick an element of S, say $s$. We then proceed to minimize
     $$P(x, r) = -g_s(x) + r \sum_{t \in T} \frac{1}{g_t(x)}$$
At each point during the minimization process the unsatisfied constraints
are checked, and in case $g_s(x) > 0$, $s \in S$, then $s$ is shifted into T.
The process is continued until a value of $x$ is found for which
$g_s(x) > 0$. The index $s$ is shifted into T and, if S is not empty,
another element of S is taken and the procedure is repeated until S is
empty.
SUMT has several desirable properties for solving nonlinear convex
programming problems. They are:
     a) If a feasible starting point is not known the method can
determine one.
     b) The method will determine a minimum if it is an interior
point or if it lies on the boundary of a constraint.
     c) The amount of computer time needed to find a minimum is
comparable to that of other methods in current use.
     d) The solution to the dual and the Lagrange multipliers
yield desirable additional information.
SUMT however is not without disadvantages. They are:
     a) In highly nonlinear problems it is frequently difficult to
determine if the objective function (or the constraints) is convex
(concave) in the region of interest, and also if the interior is a
connected region. This is true of all nonlinear methods.

     b) If the solution lies on a boundary it will be impossible
to get an exact solution, and a high price (computer time) will be paid
if it is desired to get very close to the solution.

     c) The initial value of $r$ and a method of reducing $r$ at
each subproblem minimization have not been thoroughly investigated,
although a method has been found that works reasonably well in practice.

This paper will investigate a modified penalty function, which
preserves the desirable features of SUMT and improves the undesirably
slow convergence.




4. Modification of the penalty function of SUMT 

The idea of converting a constrained function to an unconstrained
problem whose solution approaches the constrained solution is not new.
The conditions imposed on the constrained problem provide an unconstrained
function that can be minimized by existing methods. Can SUMT be modified
to increase its rate of convergence without affecting its desirable
features? What properties must an unconstrained function have in order that
it converge to the minimum of the constrained problem?

Intuitively the unconstrained problem should remain convex, in order
that we may use calculus methods to find the minimum. The unconstrained
problem must have some "built-in" method of remaining inside the feasible
region and it should be monotonically related to the objective function
and the constraints. If an iterative procedure is to be used for solving
the unconstrained problem, then each iteration should yield a point
that successively reduces the objective function (that is,
$f[x(k)] \ge f[x(k+1)]$ for integer $k > 0$). And most important, the
minimum of the sequence of unconstrained functions must approach the
minimum of the constrained objective function.

SUMT was modified, by the author of this paper, for FORTRAN-63 and
the CDC 1604 computer of the U. S. Naval Postgraduate School. The
accuracy of the conversion was tested using the sample problem furnished
with the SHARE library program [9]. Several linear and non-linear convex
programming problems with minima occurring on the boundary of one or more
constraints were tested. In all cases SUMT moved close to the solution
expeditiously; however, once close to a boundary, movement in the
direction of the minimum slowed and excessive computer time was required
to achieve the desired accuracy.


For example, in the following problem:
     Minimize  $f(x) = x_1 + (x_2 - 4)^2$
     Subject to:
          1.  $2x_1 + x_2 - 6 \ge 0$
          2.  $x_1 - 1 \ge 0$
          3.  $x_2 \ge 0$

The solution is at (1,4) and Min f(x) = 1.

Subproblem                                                        Cumulative computer
number      x_1      x_2      f[x(k)]       P[x(k), r_k]   r_k           time, seconds*

 0          22.0     21       -             -              -             0
 1          1.37     3.38     (illegible)   17.5           1.37          (illegible)
 2          1.219    3.80     1.52          4.97           .34           .716
 3          1.167    4.084    (illegible)   2.206          .086          1.583
 4          1.109    4.100    (illegible)   1.53           .021          (illegible)
 5          1.054    4.09     (illegible)   1.25           5.3x10^-3     3.9
 6          1.02     4.07     1.05          (illegible)    1.3x10^-3     5.3
 7          1.012    4.059    1.02          1.06           3.3x10^-4     6.566
 8          1.006    4.04     1.014         1.029          8.4x10^-5     8.050

16          1.0000   4.001    1.00          1.0001         (illegible)   20.433

*Not including compiler time.


It is obvious that we are close to the solution at the sixth
subproblem minimum. At that time the values of the constraints are:
$g_1(x) = .096$, $g_2(x) = .02$, and $g_3(x) = 4.07$. The first and second
constraints are binding, and in the direction of the minimum they will
become even smaller, causing an increase in the P function that nullifies
the decrease in $r$. The rate of movement towards the minimum is reduced
as the boundary is approached.

One obvious way to increase the rate of convergence of SUMT is to
modify the penalty term so that its rate of increase, as a boundary is
approached, is reduced. The modification must not change any of the
desired features of SUMT. The following P function retains all the
necessary properties of SUMT and, for $v < 1$, buffers the rate of increase
of the penalty term.

     $$P[x, r, v] = f(x) + r \sum_{i=1}^{m} \left( \frac{1}{g_i(x)} \right)^v$$

for $0 < v < 1$, $r > 0$.

It will be shown that P(x, r, v) has all the properties of P(x, r)
and that P(x, r) is but a special case of P(x, r, v), that is, with v = 1.
P(x, r, v), with v < 1, buffers the rate of increase of the binding
constraints as the following table demonstrates:

                              [1/g_i(x)]^v
g_i(x)    1/g_i(x)    v = 1    v = .5    v = .4    v = .1
1         1           1        1         1         1
.5        2           2        1.41      1.32      1.07
.25       4           4        2         1.74      1.15
.1        10          10       3.16      2.51      1.26

4.1 We restate the convex nonlinear programming problem in its new form:
     Minimize  $f(x)$
     Subject to  $g_i(x) \ge 0$,  $i = 1,\ldots,m$
Define the function:
     $$P[x, r, v] = f(x) + r \sum_{i=1}^{m} \left( \frac{1}{g_i(x)} \right)^v$$
In addition the following conditions hold:
     C1: $R^\circ = \{ x \mid g_i(x) > 0,\ i = 1,\ldots,m \}$ is
not empty. Denote by $R$ the closure of $R^\circ$.
     C2: $f(x)$ and $-g_i(x)$ are convex and twice continuously
differentiable for $x \in R$.
     C3: for every finite $h$, $\{ x \mid f(x) \le h,\ x \in R \}$
is a bounded set.
     C4: for every $r > 0$ and $v > 0$, $P[x, r, v]$
is strictly convex.
     C5: $r_1 > r_2 > \cdots > r_k > \cdots > 0$.
Define: $x(k)$ as the point that minimizes $P(x, r_k, v)$.
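Lemma 1
     If $g(x)$ is concave then $r[1/g(x)]^v$ is convex for
$v > 0$, $r > 0$ and $x \in R^\circ$.
Proof:

The matrix of second partials of $r[1/g(x)]^v$ is

$$\left\| \frac{\partial^2 \{ r[1/g(x)]^v \}}{\partial x_i \partial x_j} \right\| = \left\| \frac{-vr}{[g(x)]^{v+1}} \frac{\partial^2 g(x)}{\partial x_i \partial x_j} \right\| + \left\| \frac{v(v+1)r}{[g(x)]^{v+2}} \frac{\partial g(x)}{\partial x_i} \frac{\partial g(x)}{\partial x_j} \right\|$$

For $x \in R^\circ$, $g(x) > 0$, $r > 0$, and $v > 0$, the terms
$\frac{vr}{[g(x)]^{v+1}}$ and $\frac{v(v+1)r}{[g(x)]^{v+2}}$ are positive
and can be factored out of their respective matrices on the right:

$$\left\| \frac{\partial^2 \{ r[1/g(x)]^v \}}{\partial x_i \partial x_j} \right\| = \frac{vr}{[g(x)]^{v+1}} \left\| -\frac{\partial^2 g(x)}{\partial x_i \partial x_j} \right\| + \frac{v(v+1)r}{[g(x)]^{v+2}} \left\| \frac{\partial g(x)}{\partial x_i} \frac{\partial g(x)}{\partial x_j} \right\|$$

Since $g(x)$ is concave by condition C2,
$\left\| \frac{\partial^2 g(x)}{\partial x_i \partial x_j} \right\|$
is negative semi-definite, therefore
$\left\| -\frac{\partial^2 g(x)}{\partial x_i \partial x_j} \right\|$
is positive semi-definite.

The second matrix on the right is of rank 1, or less (the rank of
a matrix is defined as the order of the largest non-vanishing
determinant), since all determinants of order 2 or greater are equal
to zero.

Proof:

$$\begin{vmatrix} \frac{\partial g}{\partial x_i}\frac{\partial g}{\partial x_j} & \frac{\partial g}{\partial x_i}\frac{\partial g}{\partial x_l} \\ \frac{\partial g}{\partial x_k}\frac{\partial g}{\partial x_j} & \frac{\partial g}{\partial x_k}\frac{\partial g}{\partial x_l} \end{vmatrix} = 0 \quad \text{for all } i, j, k, l$$

Multiplying on the left by $y^T$ and on the right by $y$, where $y$ is an
arbitrary n-dimensional vector not equal to the null vector, yields the
following quadratic form:

$$y^T \left\| \frac{\partial^2 \{ r[1/g(x)]^v \}}{\partial x_i \partial x_j} \right\| y = \frac{vr}{[g(x)]^{v+1}}\, y^T \left\| -\frac{\partial^2 g(x)}{\partial x_i \partial x_j} \right\| y + \frac{v(v+1)r}{[g(x)]^{v+2}}\, y^T \left\| \frac{\partial g(x)}{\partial x_i} \frac{\partial g(x)}{\partial x_j} \right\| y$$

For all values of the vector $y$, except $y = 0$, the first matrix
on the right is positive semi-definite and therefore its quadratic form
is greater than or equal to zero for all values of $y$. The second matrix
on the right is of rank 1, hence the quadratic form is factorable. Newman
(pages 111-112) proves that the quadratic form

$$\sum_i \sum_j a_{ij} y_i y_j = \Big( \sum_i b_i y_i \Big) \Big( \sum_j c_j y_j \Big)$$

if the rank of $\| a_{ij} \|$ is one.

Therefore

$$\frac{v(v+1)r}{[g(x)]^{v+2}}\, y^T \left\| \frac{\partial g(x)}{\partial x_i} \frac{\partial g(x)}{\partial x_j} \right\| y = \frac{v(v+1)r}{[g(x)]^{v+2}} \sum_i \sum_j \frac{\partial g(x)}{\partial x_i} \frac{\partial g(x)}{\partial x_j} y_i y_j = \frac{v(v+1)r}{[g(x)]^{v+2}} \Big[ \sum_i \frac{\partial g(x)}{\partial x_i} y_i \Big]^2 \ge 0$$

for all values of $y$.

Hence by definition $r[1/g(x)]^v$ is convex. This completes the proof
of Lemma 1.

The sum of convex functions is a convex function, hence P(x, r, v) is
convex.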
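
As an illustration of Lemma 1 in its simplest setting, consider a single variable $x$. This one-dimensional specialization is added here only as a check and is not part of the original proof; the linear constraint $g(x) = x$ used at the end is an assumed example.

```latex
% One-variable form of the second derivative used in Lemma 1.
\[
  \frac{d^2}{dx^2}\, r\,[g(x)]^{-v}
  = -\frac{v r}{[g(x)]^{v+1}}\, g''(x)
    + \frac{v(v+1)\, r}{[g(x)]^{v+2}}\, [g'(x)]^2 .
\]
% With g concave (g'' <= 0), g > 0, r > 0 and v > 0, both terms are
% non-negative, so the penalty term is convex on the interior.
% Assumed example: g(x) = x on x > 0, for which g' = 1 and g'' = 0:
\[
  \frac{d^2}{dx^2}\, r\, x^{-v} = \frac{v(v+1)\, r}{x^{v+2}} > 0 .
\]
```

The first term corresponds to the positive semi-definite matrix in the proof and the second to the rank-one matrix, whose quadratic form reduces to a perfect square.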


At this point another condition is imposed on the basic non-linear
convex programming problem.

     Condition C6: The greatest lower bound of $f(x)$, $x \in R$,
is finite, i.e., $\inf f(x) = V_0 > -\infty$.
Lemma 2: Under conditions C1 thru C6, P(x, r, v) is bounded
below for $x \in R^\circ$ and any $r > 0$, $v > 0$.

Proof:
     $$P[x, r, v] = f(x) + r \sum_{i=1}^{m} \left( \frac{1}{g_i(x)} \right)^v > \min_{x \in R} f(x) \ge V_0 > -\infty$$
($R^\circ \subset R$, conditions C1 and C6)
Define: $x^\circ$ as the interior point at which minimization begins.
Lemma 3: a) Any local minimum of P(x, r, v) is in $R^\circ$ and is
finite.
         b) At least one such point exists.
Proof: Condition C1 ($R^\circ$ is not empty) is sufficient for the
existence of a point $x^\circ$. Let $r^\circ$ be any value of $r > 0$.
Define $M_0 = P(x^\circ, r^\circ, v) > V_0$
(Lemma 2 and $v > 0$).

For any boundary point $x^b$, $g_i(x^b) = 0$ for some $i$ (condition C2),
hence the P function is not defined at $x^b$ and any local minimum must be
an element of $R^\circ$.

It is then possible to form the sets
     $S_0 = \{ x \mid f(x) \le M_0,\ x \in R \}$
and  $S_i = \{ x \mid r^\circ (1/g_i(x))^v \le M_0 - V_0,\ x \in R^\circ \}$,  $i = 1,\ldots,m$

Note: $S_0$ and $S_i$, $i = 1,\ldots,m$, are closed.

Let $S = \bigcap_{i=0}^{m} S_i$.

For any point $y \in R^\circ$ and $y \notin S$, either
     $f(y) > M_0$  or  $r^\circ (1/g_i(y))^v > M_0 - V_0$ (for some $i$).
If $f(y) > M_0$ then
     $$P[y, r^\circ, v] = f(y) + r^\circ \sum_{i=1}^{m} \left( \frac{1}{g_i(y)} \right)^v > f(y) > M_0$$
If $r^\circ (1/g_i(y))^v > M_0 - V_0$ then
     $$P[y, r^\circ, v] = f(y) + r^\circ \sum_{i=1}^{m} \left( \frac{1}{g_i(y)} \right)^v > V_0 + (M_0 - V_0) = M_0$$

By the definition of a local minimum, any local minimum of
$P(x, r^\circ, v)$ must be in S (if it exists). By construction S is
non-empty ($x^\circ \in S$), and is closed and bounded (condition C3
ensures that S is bounded). Hence, part a is proved.

$P(x, r^\circ, v)$ is continuous on a compact set S and hence assumes a
global minimum in S. This implies the existence of a local minimum in
$R^\circ$. This proves part b.

Theorem 1: Subject to conditions C1-C6 the function P(x, r, v)
has at least one local minimum $x(r) \in R^\circ$.
Furthermore:
     a) $x(r)$ is finite
     b) $\nabla_x P[x(r), r, v] = 0$
     c) $\left\| \frac{\partial^2 P[x(r), r, v]}{\partial x_i \partial x_j} \right\|$
is a positive definite matrix.

Proof:
     a) $x(r)$ is a finite element of $R^\circ$ by Lemma 3.

     b) $\nabla_x P(x(r), r, v) = 0$ is the necessary condition for
a minimum at $x(r)$.

     c) the matrix of second partials is positive definite
throughout the feasible region by definition of a
strictly convex function.

Lemma 4. There is at most one local minimum of P(x, r, v) for any
$r > 0$ and a fixed value of $v > 0$.
Proof by contradiction: Assume there exist two points $x(r)$
and $x'(r)$ at which P(x, r, v) has a local minimum value in
$R^\circ$. By condition C4, P(x, r, v) is strictly convex, then for
$0 < \lambda < 1$:
     $$P[(1-\lambda)x(r) + \lambda x'(r), r, v] < (1-\lambda) P[x(r), r, v] + \lambda P[x'(r), r, v]$$
Since $x(r)$ is assumed to be a local minimum, for $\lambda$
sufficiently small
     $$P[x(r), r, v] \le P[(1-\lambda)x(r) + \lambda x'(r), r, v] < (1-\lambda) P[x(r), r, v] + \lambda P[x'(r), r, v]$$

Transposing and collecting terms:

     (1)  $\lambda P[x(r), r, v] < \lambda P[x'(r), r, v]$
Using the same procedure with $(1-\lambda)x'(r) + \lambda x(r)$ yields

     (2)  $\lambda P[x'(r), r, v] < \lambda P[x(r), r, v]$

Clearly equations 1 and 2 cannot both hold.


Now consider the iterative method of minimizing the P(x, r, v)
function. Starting from an interior point $x^\circ$ and with a fixed value
of $v > 0$ and a value of $r_1 > 0$, minimize $P(x, r_1, v)$. Such a
minimum exists by theorem 1 and is unique by Lemma 4. Then reduce r,
i.e., $r_2 < r_1$, and minimize $P(x, r_2, v)$ using the point $x(1)$
(the minimum of $P(x, r_1, v)$) as a starting point, etc. The final steps
of the proof will be to show that such an iterative method converges to
the solution of the initial non-linear convex programming problem.
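
The iterative procedure just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the modified SHARE program: it applies the P(x, r, v) function to the example of section 3 (minimize $x_1 + x_2$ subject to $x_1 - 1 \ge 0$, $x_2 - 1 \ge 0$), uses a damped Newton step on each coordinate in place of Davidon's method (possible here because the example is separable), and reduces r by an assumed factor of 4 at each subproblem. All function names are hypothetical.

```python
def p_value(x, r, v):
    """P(x, r, v) = x1 + x2 + r * sum((1/(xj - 1))**v); needs every xj > 1."""
    return sum(xj + r * (xj - 1.0) ** (-v) for xj in x)


def minimize_p(x_start, r, v, tol=1e-12, max_iter=200):
    """Minimize P coordinate by coordinate with a damped Newton iteration."""

    def pj(t):
        # Contribution of one coordinate to P; only called for t > 1.
        return t + r * (t - 1.0) ** (-v)

    x = list(x_start)
    for j in range(len(x)):
        for _ in range(max_iter):
            s = x[j] - 1.0  # constraint value g_j(x) > 0
            grad = 1.0 - r * v * s ** (-(v + 1.0))
            hess = r * v * (v + 1.0) * s ** (-(v + 2.0))
            step = -grad / hess
            # Halve the step until the trial point is interior and P_j
            # does not increase (the interior check is evaluated first).
            while abs(step) > tol and (
                x[j] + step <= 1.0 or pj(x[j] + step) > pj(x[j])
            ):
                step *= 0.5
            if abs(step) <= tol:
                break
            x[j] += step
    return x


def modified_sumt(x0, r0, v, shrink=0.25, n_sub=8):
    """Solve n_sub subproblems, each started from the previous minimizer."""
    history = []
    x, r = list(x0), r0
    for _ in range(n_sub):
        x = minimize_p(x, r, v)
        history.append((r, x[:], sum(x), p_value(x, r, v)))
        r *= shrink
    return history


if __name__ == "__main__":
    for r, x, f, p in modified_sumt([3.0, 3.0], 1.0, 0.5):
        print(f"r={r:<12g} x=({x[0]:.6f}, {x[1]:.6f})  f={f:.6f}  P={p:.6f}")
```

For this example the subproblem minimizer is known in closed form, $x_j = 1 + (rv)^{1/(v+1)}$, so the output of the sketch can be compared against it; the printed P values decrease from one subproblem to the next.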


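Lemma 5. For $r_k > r_{k+1} > 0$,
     $$P[x(k), r_k, v] > P[x(k+1), r_{k+1}, v]$$

Proof:
     $$P[x(k), r_k, v] = f[x(k)] + r_k \sum_{i=1}^{m} \left( \frac{1}{g_i[x(k)]} \right)^v$$
     $$> f[x(k)] + r_{k+1} \sum_{i=1}^{m} \left( \frac{1}{g_i[x(k)]} \right)^v$$
     $$\ge P[x(k+1), r_{k+1}, v]$$
since the point $x(k)$ is the starting point for minimization of
$P(x, r_{k+1}, v)$.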
Theorem 2: Basic convergence theorem
     Under conditions C1 to C6, the values $P(x(1), r_1, v)$, ...,
$P(x(k), r_k, v)$, ... approach the solution value, $V_0$, of the convex
programming problem as $r_k \to 0$ ($k \to \infty$), i.e.,
     $$\lim_{k \to \infty} \left[ \min_x P(x, r_k, v) \right] = V_0$$
Proof:

Let $\varepsilon > 0$ be chosen. Then consider $x^*$ such that
$x^* \in R^\circ$ and $f(x^*) < V_0 + \varepsilon/2$.

Select $k^*$ from $r_1, r_2, \ldots, r_k, \ldots$ such that
     $$r_{k^*} < \left[ \min_i g_i(x^*) \right]^v \frac{\varepsilon}{2m}$$

Then for $k > k^*$:
     $\min_x P(x, r_k, v)$ exists (Theorem 1),
     is unique (Lemma 4), and
     $$P[x(k), r_k, v] < P[x(k^*), r_{k^*}, v] \quad \text{(Lemma 5)}$$
     $$\le P[x^*, r_{k^*}, v]$$
     $$= f(x^*) + r_{k^*} \sum_{i=1}^{m} \left( \frac{1}{g_i(x^*)} \right)^v$$
     $$< V_0 + \varepsilon/2 + \varepsilon/2 = V_0 + \varepsilon$$
Since $P[x(k), r_k, v] > f[x(k)] \ge V_0 > V_0 - \varepsilon$,
then
     $$V_0 - \varepsilon < P[x(k), r_k, v] < V_0 + \varepsilon$$

This completes the proof of the convergence of P(x, r, v) to the minimum
of the convex programming problem.


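Lemma 6.
     a) $\lim_{k \to \infty} r_k \sum_{i=1}^{m} \left( \frac{1}{g_i[x(k)]} \right)^v = 0$
     b) $\lim_{k \to \infty} f[x(k)] = V_0$
Proof: The following inequalities are true.

(1)  $f[x(k+1)] + r_{k+1} \sum_i \left( \frac{1}{g_i[x(k+1)]} \right)^v \le f[x(k)] + r_{k+1} \sum_i \left( \frac{1}{g_i[x(k)]} \right)^v$

and (2)  $f[x(k)] + r_k \sum_i \left( \frac{1}{g_i[x(k)]} \right)^v \le f[x(k+1)] + r_k \sum_i \left( \frac{1}{g_i[x(k+1)]} \right)^v$

Adding and transposing:
     $(r_k - r_{k+1}) \sum_i \left( \frac{1}{g_i[x(k)]} \right)^v \le (r_k - r_{k+1}) \sum_i \left( \frac{1}{g_i[x(k+1)]} \right)^v$

Since $r_k - r_{k+1} > 0$ then

(3)  $\sum_i \left( \frac{1}{g_i[x(k+1)]} \right)^v \ge \sum_i \left( \frac{1}{g_i[x(k)]} \right)^v$

From (1),
     $f[x(k+1)] - f[x(k)] \le r_{k+1} \left( \sum_i \left( \frac{1}{g_i[x(k)]} \right)^v - \sum_i \left( \frac{1}{g_i[x(k+1)]} \right)^v \right)$

From (3) the right hand side is non-positive, which implies

(4)  $f[x(k+1)] \le f[x(k)]$

so $\{ f[x(k)] \}$ is a decreasing sequence bounded below (condition C6)
by $V_0$.

Therefore $\lim_{k \to \infty} f[x(k)] \ge V_0$, and by Theorem 2
     $$\lim_{k \to \infty} f[x(k)] + \lim_{k \to \infty} r_k \sum_i \left( \frac{1}{g_i[x(k)]} \right)^v = V_0$$
so that
     $$V_0 - \lim_{k \to \infty} f[x(k)] \ge 0$$
since
     $$r_k \sum_i \left( \frac{1}{g_i[x(k)]} \right)^v \ge 0$$
which implies
     $$\lim_{k \to \infty} f[x(k)] = V_0$$
and therefore
     $$\lim_{k \to \infty} r_k \sum_i \left( \frac{1}{g_i[x(k)]} \right)^v = 0$$

This proves Lemma 6.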


For convenience we restate the dual of the convex programming 


problem: 


Maximize G[x,r] = Fo) SW HE) 


Subject to: V, 5) = 3 \,’ Va 


hy 20 


Theorem 3: Under conditions Cl thru C6 the method yields points 


(XH), MO), hi = +4 (Gin) 


that are dual feasible, and values of GTX), A(ry with, 


L 1M Gx, M] = Ve 


paw 0 
pm ov 


Proof: 


From theorem 1.b, the gradient of P(x(F), Ty, ,v) is equal 


to zero, that is: 


K PIX, 5 Yr 


which implies: 


vt! 


Hy) 


V, $[xtr)] ~ vA EY WT 10 saa 


ii 


| vtl 
Y Siem] EY SHOU rag), 


oi 


+v +l 


8 (Figea) 


Choosing hw) 
} 


Then 


"i 


VG Slre) = ZN, ped, 


Thus, at any P minimum, the dual side constraints are satisfied by 
[xX(r), ACh) d 
From Lemma 6, //” ${¥(y] = Vo as 4.70 and 
\ wv 
Eas E (‘/pitr en) = 0 then for an €>0 an (ye) 


can be found such that for % < hw) 


Vv. £ Flu <VUreé 


(oe v 
and iets % & (am00) <0 
Adding yields V-€ < $Ray ~ h E (za) < wt 
ae 
Thus, G xc), A] = § [x0] -4F (geen) 


SV mae ae, mel 


Therefore for any P minimum the solution $v_0$ to the convex programming
problem is bounded above, by $P(x(r_k), r_k, v)$, and below, by
$G[x(r_k), \lambda(r_k)]$. This result provides a criterion for termination
of the method.

The only requirement on v, for the proof of convergence of the P(x, r, v)
function, is v > 0. The P(x, r) function of SUMT is equal to the
P(x, r, v) function for v = 1; therefore the unconstrained function of SUMT
is a special case of the P(x, r, v) function.
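
As a rough numerical illustration of these bounds, the sketch below evaluates
G and P at the P minimum of a one-variable toy problem that is not taken from
this paper: minimize x subject to x >= 0, for which v_0 = 0. The function
name `bounds` and the closed-form minimizer x(r) = (r v)^(1/(v+1)) are
assumptions that hold only for this toy problem.

```python
def bounds(r, v):
    """Return (G, P) at the P minimum for: minimize x subject to x >= 0.

    Hypothetical toy problem: f(x) = x, g(x) = x, optimal value v_0 = 0.
    """
    x = (r * v) ** (1.0 / (v + 1.0))   # solves 1 - r*v / x**(v+1) = 0
    lam = r * v / x ** (v + 1.0)       # multiplier chosen as in Theorem 3
    P = x + r * (1.0 / x) ** v         # upper bound on v_0
    G = x - lam * x                    # dual value f(x) - lam*g(x), lower bound
    return G, P


for r in (1.0, 0.25, 0.0625):
    G, P = bounds(r, v=0.5)
    print(f"r={r:<7} G={G:+.6f}  P={P:.6f}  gap={P - G:.6f}")
```

For this toy problem the dual value stays at v_0 (up to rounding) while P
approaches it from above, so the difference P - G could serve as a stopping
measure.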
5. Behavior of the Modified SUMT as a function of v.

Up to now very little has been said as to what the introduction of
the new variable v would accomplish. The original idea behind the
development of the P(x, r, v) function was to increase the rate of
convergence of SUMT for problems with solution on the boundary. Will v > 0,
and not equal to one, increase the rate of convergence of SUMT? If so, what
is the optimum value of v? The answers to those questions are quite
complicated for the general case. The behavior of the modified SUMT for a
simple function will be described below.

The following example demonstrates the effect of v on the P function 
minima. 

Minimize $x_1 + x_2$

Subject to $x_1 \ge 0$

$x_2 \ge 0$

The unconstrained problem is

Minimize $P(x, r, v) = x_1 + x_2 + r \left(\frac{1}{x_1}\right)^{v} + r \left(\frac{1}{x_2}\right)^{v}$

By theorem 1, the gradient of P is the null vector at a minimum,
for a fixed value of r > 0 and v > 0. This implies

$$1 - \frac{r v}{x_1^{\,v+1}} = 0, \qquad 1 - \frac{r v}{x_2^{\,v+1}} = 0$$

The minimum of the constrained problem occurs at (0,0) with $v_0 = 0$.
Table I lists the minimum points and values of P(x, r, v) for a fixed value
of v and a reduction of r at each step. It is evident from Table I that the
smallest value of v, i.e., v = .125, produces the largest reduction in the
P function and values of $x_1$ and $x_2$ that are closest to the desired
value.
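
A minimal sketch of how such minima can be tabulated for this sample problem
follows. It assumes the stationarity condition 1 - r v / x_i^(v+1) = 0, which
gives x_1 = x_2 = (r v)^(1/(v+1)). The function name `p_minimum` and the grid
of r and v values are illustrative choices, not values copied from Table I.

```python
def p_minimum(r, v):
    """Minimizer and minimum of P = x1 + x2 + r*(1/x1)**v + r*(1/x2)**v.

    Setting each partial derivative 1 - r*v / x_i**(v+1) to zero gives
    x1 = x2 = (r*v)**(1/(v+1)).
    """
    x = (r * v) ** (1.0 / (v + 1.0))
    return x, 2.0 * x + 2.0 * r * (1.0 / x) ** v


# Illustrative grid only; these r and v values are assumed.
for v in (2.0, 1.0, 0.5, 0.25, 0.125):
    for r in (1.0, 0.25, 0.0625):
        x, p = p_minimum(r, v)
        print(f"v={v:<6} r={r:<7} x1=x2={x:.4f}  P={p:.4f}")
```

On this assumed grid, for each fixed r the minimizing point moves closer to
the constrained solution (0,0) as v is reduced.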


The SUMT computing program [9] was modified for the P(x, r, v) function.
The sample problem above was used for a test of the procedure.
TABLE 1

VALUES OF THE P FUNCTION FOR DIFFERENT VALUES OF r AND v

[Columns: v, r, x_1, x_2, P(x, r, v). The individual entries are illegible
in this copy.]



Table II tabulates the results achieved. The solution of the sample
problem using SUMT is also shown. The time difference between the modified
version of SUMT with v = 1 and the unmodified SUMT is due to the
additional computation required for the exponent v. As expected, the time
required for solution decreases as v is reduced. If SUMT is used as a base
for comparisons then v < .75 is required (for this problem) to produce
a reduction in computational time. Although this is a simple problem it
serves to illustrate the reduction of computer time expected for a convex
programming problem whose solution lies on the boundary of one or
more constraints.


If the value of v is too small (close to zero) the penalty term tends
to a constant, that is,

$$\lim_{v \to 0} r \sum_{i=1}^{m} \left(\frac{1}{g_i(x)}\right)^{v} = r m$$

Thus v must be greater than zero. The effect of v (in the interval (0,1))
on the penalty function varies with each constraint. For those
constraints whose values are greater than one (reciprocal less than one),
v increases their values and effect on the penalty function. For those
constraints whose values are less than one (reciprocal greater than one),
v decreases their values and effect on the penalty function. Thus the
effect of v on the penalty term, the gradient of P, and the matrix of
second partials is directly related to the values of the constraints.
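
The following short sketch illustrates this effect numerically. The
constraint values 4, 1 and 0.25, the exponent v = 0.5, and the helper name
`penalty_term` are assumed for illustration only.

```python
def penalty_term(g, v):
    """Contribution (1/g)**v of one constraint with value g > 0."""
    return (1.0 / g) ** v


v = 0.5  # assumed exponent in the interval (0, 1)
for g in (4.0, 1.0, 0.25):
    print(f"g={g:<5} SUMT term 1/g={1.0 / g:<5} modified term={penalty_term(g, v):.3f}")

# As v -> 0 every term tends to 1, so r times the sum of m terms tends to r*m.
r, gs = 0.5, (4.0, 1.0, 0.25)
print(r * sum(penalty_term(g, 1e-9) for g in gs))  # close to r * 3 = 1.5
```

With g = 4 the term grows from 0.25 to 0.5, and with g = 0.25 it shrinks from
4 to 2, which is the behavior described above for v in (0,1).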


SUMT [5,8] uses a second order gradient method to minimize P(x, r, v),
given by

$$x^{n+1} = x^{n} - \theta \left[\frac{\partial^2 P}{\partial x_i \, \partial x_j}\right]^{-1} \nabla_x P(x^{n}, r, v)$$

where $\theta$ is determined by a search to minimize the function
P(x, r, v) along the modified gradient. Then the process is repeated,
starting from the point $x^{n+1}$, until a minimum of P(x, r, v) is achieved.
TABLE 2

MODIFIED SUMT SOLUTIONS OF THE SAMPLE PROBLEM

                                            Minimum      Computer
v             x_1          x_2              P(x,r,v)     time in seconds

[illegible]   19.E-05*     16.E-05          43.E-05      44.066
[illegible]   12.E-05      15.E-05          37.E-05      34.433
[illegible]   41.E-06      76.E-06          18.E-05      24.76
.75           57.E-06      27.E-06          14.E-05      [illegible]
[illegible]   54.E-06      23.E-06          15.E-05      18.666
.20           84.E-07      11.E-06          88.E-06      [illegible]
.125          44.E-07      46.E-07          80.E-06      14.434
.0625         21.E-07      68.E-07          75.E-06      13.584
.02           34.E-08      54.E-07          85.E-06      12.650
.01           24.E-08      38.E-08          71.E-06      12.467
.005          16.E-08      16.E-08          66.E-06      12.484
.002          30.E-08      59.E-09          63.E-06      12.700
.001          33.E-09      17.E-09          62.E-06      12.350

SUMT solution (unmodified)
              43.E-06      43.E-06          17.E-05      20.05

*Note: 19.E-05 = 19 x 10^-5 = .00019
The criterion used to terminate the procedure is that the magnitude of
the gradient be less than some pre-assigned small positive number, i.e.,
$\left| \nabla_x P(x, r, v) \right| < \varepsilon$.
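
The second order gradient iteration and the gradient-magnitude stopping test
can be sketched for the sample problem as follows. This is only an
illustration under stated assumptions and not the SUMT computing program
itself: the step-halving search, the reduction factor `shrink`, the tolerance
`eps`, and all function names are hypothetical choices.

```python
def P(x, r, v):
    """P(x, r, v) for the sample problem: minimize x1 + x2, x1 >= 0, x2 >= 0."""
    return sum(x) + r * sum((1.0 / xi) ** v for xi in x)


def gradient(x, r, v):
    # dP/dx_i = 1 - r*v / x_i**(v+1)
    return [1.0 - r * v / xi ** (v + 1.0) for xi in x]


def second_partials(x, r, v):
    # The matrix of second partials is diagonal for this problem:
    # d2P/dx_i^2 = r*v*(v+1) / x_i**(v+2)
    return [r * v * (v + 1.0) / xi ** (v + 2.0) for xi in x]


def minimize_P(x, r, v, eps=1e-8, max_iter=500):
    """Second order gradient iteration with an assumed step-halving search."""
    for _ in range(max_iter):
        g = gradient(x, r, v)
        if sum(gi * gi for gi in g) ** 0.5 < eps:      # |grad P| < eps
            break
        h = second_partials(x, r, v)
        direction = [gi / hi for gi, hi in zip(g, h)]  # inverse Hessian times gradient
        theta = 1.0
        while True:
            trial = [xi - theta * di for xi, di in zip(x, direction)]
            # Stay strictly feasible and do not increase P.
            if all(t > 0.0 for t in trial) and P(trial, r, v) <= P(x, r, v):
                break
            theta *= 0.5
        x = trial
    return x


def modified_sumt(x0, v, r0=1.0, shrink=0.25, stages=8):
    """Minimize P for a decreasing sequence of r, restarting from the last minimum."""
    x, r = list(x0), r0
    for _ in range(stages):
        x = minimize_P(x, r, v)
        r *= shrink
    return x, r / shrink   # last point and the r it was computed for


x, r = modified_sumt([1.0, 1.0], v=0.5)
print(x, r)
```

Each stage starts from the previous minimum, which remains strictly feasible
when r is reduced.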

The parameter v < 1 affects the gradient by the factor v and by the 
exponent v + 1 which is less than the exponent 2 of SUMT. Also the matrix 
of second partials is affected by the factor v(v + 1) and the exponent 
v + 2 which is less than the exponent 3 of SUMT, (Section 4). 
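
For reference, the factors and exponents mentioned here follow from
differentiating P(x, r, v) = f(x) + r \sum (1/g_i(x))^v term by term. The
expressions below are a worked sketch under the assumption that f and the
g_i are twice differentiable and g_i(x) > 0; setting v = 1 recovers the
exponents 2 and 3 of SUMT.

```latex
\nabla_x P(x, r, v) = \nabla_x f(x)
    - r v \sum_{i=1}^{m} \frac{\nabla_x g_i(x)}{g_i(x)^{\,v+1}}

\nabla_x^2 P(x, r, v) = \nabla_x^2 f(x)
    - r v \sum_{i=1}^{m} \frac{\nabla_x^2 g_i(x)}{g_i(x)^{\,v+1}}
    + r v (v+1) \sum_{i=1}^{m}
      \frac{\nabla_x g_i(x) \, \nabla_x g_i(x)^{T}}{g_i(x)^{\,v+2}}
```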

Since it is almost impossible to determine a priori whether the 
solution point is interior or on the boundary of the feasible region, the 
parameter v, in the interval (0,1), should also reduce the computational 
time for problems with interior solutions. 

As explained above, v, in the interval (0,1), reduces the effect of
the constraint on the penalty function and on the second order gradient
method. For a problem with an interior point solution, this is a desirable
feature. If we knew ahead of time that the solution was an interior
point, then we could eliminate the constraints and minimize f(x).
Therefore v < 1 should serve to accelerate the rate of convergence of convex
programming problems with interior solutions.




6. Conclusions and Acknowledgements. 


Fiacco, McCormick, and Mylander [11] demonstrated that the SUMT
program converges to the solution of a convex programming problem, and
that their method is as efficient as any other method in current use.

The proof of convergence of the P(x, r, v) function for v > 0 is shown in
Section 4. For v in the interval (0,1) the rate of convergence is
accelerated for a sample problem. This increase in the rate of convergence
is not expected to be as pronounced for all classes of nonlinear problems;
however, it is expected that 0 < v < 1 will reduce the computational
time of SUMT for nonlinear problems with solutions on one or more
constraint boundaries. It is also expected that the introduction of v will
not increase the time required for a problem with an interior solution.

I would like to express my gratitude for the inspiration, encouragement,
and guidance which Associate Professor Uno R. Kodres has provided
throughout the preparation of this paper.